Generative Compression
Traditional image and video compression algorithms rely on hand-crafted
encoder/decoder pairs (codecs) that lack adaptability and are agnostic to the
data being compressed. Here we describe the concept of generative compression,
the compression of data using generative models, and suggest that it is a
direction worth pursuing to produce more accurate and visually pleasing
reconstructions at much deeper compression levels for both image and video
data. We also demonstrate that generative compression is orders of magnitude
more resilient to bit error rates (e.g. from noisy wireless channels) than
traditional variable-length coding schemes.
Are Lock-Free Concurrent Algorithms Practically Wait-Free?
Lock-free concurrent algorithms guarantee that some concurrent operation will
always make progress in a finite number of steps. Yet programmers prefer to
treat concurrent code as if it were wait-free, guaranteeing that all operations
always make progress. Unfortunately, designing wait-free algorithms is
generally a very complex task, and the resulting algorithms are not always
efficient. While obtaining efficient wait-free algorithms has been a long-time
goal for the theory community, most non-blocking commercial code is only
lock-free.
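To make the distinction concrete, here is a minimal C++ sketch (our illustration, not code from the paper) of the same increment written lock-free and wait-free:

```cpp
// Lock-free vs. wait-free increment of a shared counter.
#include <atomic>

std::atomic<long> counter{0};

// Lock-free: if this thread's CAS fails, another thread's CAS must have
// succeeded, so the system as a whole always makes progress -- but this
// particular thread may, in principle, retry forever.
void lock_free_increment() {
    long old_val = counter.load();
    while (!counter.compare_exchange_weak(old_val, old_val + 1)) {
        // CAS failed; old_val was reloaded with the current value. Retry.
    }
}

// Wait-free: every thread completes in a bounded number of steps, here
// by delegating to the hardware's atomic fetch-and-add.
void wait_free_increment() {
    counter.fetch_add(1);
}
```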
This paper suggests a simple solution to this problem. We show that, for a
large class of lock-free algorithms, under scheduling conditions which
approximate those found in commercial hardware architectures, lock-free
algorithms behave as if they are wait-free. In other words, programmers can
keep on designing simple lock-free algorithms instead of complex wait-free
ones, and in practice, they will get wait-free progress.
Our main contribution is a new way of analyzing a general class of lock-free
algorithms under a stochastic scheduler. Our analysis relates the individual
performance of processes with the global performance of the system using Markov
chain lifting between a complex per-process chain and a simpler system progress
chain. We show that lock-free algorithms are not only wait-free with
probability 1, but that in fact a general subset of lock-free algorithms can be
closely bounded in terms of the average number of steps required until an
operation completes.
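For reference, the standard notion of lifting (in the Chen-Lovász-Pak sense, which we assume is the variant adapted here) relates a complex chain on a larger state space to a simpler chain via a surjection f between their state spaces that preserves stationary mass and ergodic flows:

```latex
\pi(v) = \sum_{x \in f^{-1}(v)} \hat{\pi}(x),
\qquad
\pi(u)\,P(u,v) = \sum_{x \in f^{-1}(u)} \sum_{y \in f^{-1}(v)} \hat{\pi}(x)\,\hat{P}(x,y).
```

Here the complex per-process chain plays the role of the lifted chain and the system progress chain that of the base chain.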
To the best of our knowledge, this is the first attempt to analyze progress
conditions, typically stated in relation to a worst case adversary, in a
stochastic model capturing their expected asymptotic behavior.
A Complexity-Based Hierarchy for Multiprocessor Synchronization
For many years, Herlihy's elegant computability-based Consensus Hierarchy has
been our best explanation of the relative power of various types of
multiprocessor synchronization objects when used in deterministic algorithms.
However, key to this hierarchy is treating synchronization instructions as
distinct objects, an approach that is far from the real world, where
multiprocessor programs apply synchronization instructions to collections of
arbitrary memory locations. We were surprised to realize that, when considering
instructions applied to memory locations, the computability-based hierarchy
collapses. This leaves open the question of how to better capture the power of
various synchronization instructions.
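As a reminder of what the hierarchy measures, here is a minimal C++ sketch (our illustration) of why compare-and-swap, viewed as an isolated object, sits at the top of Herlihy's hierarchy: a single CAS location solves wait-free consensus for any number of threads.

```cpp
// Consensus from one compare-and-swap location: the first successful CAS
// fixes the decision; every thread then returns the same winning value.
#include <atomic>

constexpr long EMPTY = -1;            // assumed sentinel: no valid proposal is -1
std::atomic<long> decision{EMPTY};

long decide(long my_proposal) {
    long expected = EMPTY;
    // Only one thread's CAS can succeed; all others see the winner.
    decision.compare_exchange_strong(expected, my_proposal);
    return decision.load();
}
```

The collapse described above arises because real programs may apply CAS, reads, and writes to the same arbitrary memory locations, rather than to isolated objects like `decision`.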
In this paper, we provide an approach to answering this question. We present
a hierarchy of synchronization instructions, classified by their space
complexity in solving obstruction-free consensus. Our hierarchy provides a
classification of combinations of known instructions that seems to fit with our
intuition of how useful some are in practice, while questioning the
effectiveness of others. We prove an essentially tight characterization of the
power of buffered read and write instructions. Interestingly, we show a similar
result for multi-location atomic assignments.
The SkipTrie: low-depth concurrent search without rebalancing
To date, all concurrent search structures that can support predecessor queries have had depth logarithmic in m, the number of elements. This paper introduces the SkipTrie, a new concurrent search structure supporting predecessor queries in amortized expected O(log log u + c) steps, insertions and deletions in O(c log log u), and using O(m) space, where u is the size of the key space and c is the contention during the recent past. The SkipTrie is a probabilistically balanced version of a y-fast trie, consisting of a very shallow skiplist from which randomly chosen elements are inserted into a hash-table-based x-fast trie. By inserting keys into the x-fast trie probabilistically, we eliminate the need for rebalancing, and can provide a lock-free linearizable implementation. To the best of our knowledge, our proof of the amortized expected performance of the SkipTrie is the first such proof for a tree-based data structure.
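The layering can be illustrated with a toy, sequential C++ sketch (ours, not the paper's lock-free implementation; the names and the exact promotion rule are illustrative assumptions):

```cpp
// Toy SkipTrie layering: keys live in a shallow bottom structure, and a
// key whose skiplist coin flips reach the capped height of O(log log u)
// is promoted into a hash-table-based top structure.
#include <cmath>
#include <cstdint>
#include <random>
#include <set>
#include <unordered_set>

struct ToySkipTrie {
    static constexpr uint64_t KEY_SPACE = 1ull << 32;        // u
    // Cap the skiplist height at ~log2(log2 u): 5 levels for u = 2^32.
    const int max_level = static_cast<int>(std::log2(std::log2(double(KEY_SPACE))));

    std::set<uint64_t> bottom;          // stand-in for the shallow skiplist
    std::unordered_set<uint64_t> top;   // stand-in for the x-fast trie
    std::mt19937_64 rng{42};

    void insert(uint64_t key) {
        bottom.insert(key);
        int level = 0;
        while (level < max_level && (rng() & 1)) ++level;  // skiplist coin flips
        if (level == max_level) top.insert(key);           // promote to trie
    }
};
```

Because promotion is probabilistic, no rebalancing is ever needed; a predecessor query descends from the trie top into the shallow skiplist, which is what yields the O(log log u + c) bound.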
Inherent Limitations of Hybrid Transactional Memory
Several Hybrid Transactional Memory (HyTM) schemes have recently been
proposed to complement the fast, but best-effort, nature of Hardware
Transactional Memory (HTM) with a slow, reliable software backup. However, the
fundamental limitations of building a HyTM with nontrivial concurrency between
hardware and software transactions are still not well understood.
In this paper, we propose a general model for HyTM implementations, which
captures the ability of hardware transactions to buffer memory accesses, and
allows us to formally quantify and analyze the amount of overhead
(instrumentation) of a HyTM scheme. We prove the following: (1) it is
impossible to build a strictly serializable HyTM implementation that has both
uninstrumented reads and writes, even for weak progress guarantees, and (2)
under reasonable assumptions, in any opaque progressive HyTM, a hardware
transaction must incur instrumentation costs linear in the size of its data
set. We further provide two upper bound implementations whose instrumentation
costs are optimal with respect to their progress guarantees. In sum, this paper
captures for the first time an inherent trade-off between the degree of
concurrency a HyTM provides between hardware and software transactions, and the
amount of instrumentation overhead the implementation must incur.
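What "instrumentation" means here can be sketched in C++ (our illustration; the metadata layout is an assumption, not the paper's construction):

```cpp
// Each memory word is paired with metadata that software transactions
// maintain; a hardware transaction's read is "instrumented" when it must
// also touch that metadata so conflicting software transactions are detected.
#include <atomic>

struct InstrumentedWord {
    std::atomic<long> value{0};
    std::atomic<long> sw_lock{0};   // nonzero while a software tx owns the word
};

long instrumented_read(InstrumentedWord& w, bool& aborted) {
    // This extra metadata load is the per-location instrumentation cost;
    // the lower bound above says an opaque progressive HyTM must pay it
    // for every location in a hardware transaction's data set.
    if (w.sw_lock.load() != 0) {
        aborted = true;             // conflict: the hardware tx must abort
        return 0;
    }
    aborted = false;
    return w.value.load();
}
```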
Neutrinoless double-beta decay with massive scalar emission
Searches for neutrinoless double-beta decay (0νββ) place an
important constraint on models where light fields beyond the Standard Model
participate in the neutrino mass mechanism. While experimental
collaborations often consider various massless majoron models, including
various forms of majoron couplings and multi-majoron final-state processes,
none of these searches considered the scenario where the "majoron" is
not massless but has mass m_φ ~ MeV, of the same order as the Q-value of the
0νββ reaction. We consider this parameter region and estimate
constraints for m_φ of order MeV. The constraints are
affected not only by kinematical phase-space suppression but also by a change
in the signal-to-background ratio characterizing the search. As a result,
constraints for m_φ diminish significantly below the
reaction threshold. This has phenomenological implications, which we illustrate
by focusing on high-energy neutrino telescopes. Our results motivate a dedicated
analysis by 0νββ collaborations, analogous to the dedicated analyses
targeting massless majoron models.
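The phase-space suppression can be made concrete with one kinematic relation (our illustration, not the paper's derivation): the emitted scalar carries away at least its rest mass, so the summed electron kinetic-energy spectrum ends below the usual endpoint, and the decay is forbidden altogether above the Q-value:

```latex
T_{e_1} + T_{e_2} \le Q - m_\phi ,
\qquad \text{so emission requires } m_\phi \le Q .
```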
Towards consistency oblivious programming
It is well known that guaranteeing program consistency when accessing shared data comes at the price of degraded performance and scalability.
This paper initiates the investigation of consistency oblivious programming (COP). In COP, sections of concurrent code that meet certain criteria are executed without checking for consistency. However, checkpoints are added before any shared data modification to verify the algorithm was on the right track, and if not, it is re-executed in a more conservative and expensive consistent way. We show empirically that the COP approach can enhance a software transactional memory (STM) framework to deliver more efficient concurrent data structures from serial source code. In some cases the COP code delivers performance comparable to that of more complex fine-grained structures.
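The COP pattern itself can be sketched in C++ (our illustration; `unsynchronized_locate`, `validate`, `insert_at`, and `locked_insert` are assumed methods of a hypothetical structure):

```cpp
// Consistency-oblivious insert: search without synchronization, validate
// at a checkpoint before modifying shared data, and fall back to a
// conservative (here, globally locked) re-execution on failure.
#include <mutex>

std::mutex fallback_lock;

template <typename Structure, typename Key>
bool cop_insert(Structure& s, const Key& k) {
    auto pos = s.unsynchronized_locate(k);  // 1. speculative, unchecked search
    if (s.validate(pos)) {                  // 2. checkpoint before modification
        return s.insert_at(pos, k);
    }
    std::lock_guard<std::mutex> g(fallback_lock);
    return s.locked_insert(k);              // 3. conservative, consistent retry
}
```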